2 research outputs found

    Adaptive Multimodal Emotion Detection Architecture for Social Robots

    Emotion recognition is a strategy that social robots use to improve Human-Robot Interaction and to model their social behaviour. Since human emotions can be expressed in different ways (e.g., face, gesture, voice), multimodal approaches help support the recognition process. However, although studies on multimodal emotion recognition for social robots exist, they still present limitations in the fusion process: their performance drops when one or more modalities are missing or when modalities differ in quality. This is a common situation in social robotics, given the wide variety of robots' sensory capacities; hence, more flexible multimodal models are needed. In this context, we propose an adaptive and flexible emotion recognition architecture that works with multiple sources and modalities of information and manages different levels of data quality and missing data, so that robots can better understand the mood of people in a given environment and adapt their behaviour accordingly. Each modality is analyzed independently, and the partial results are then aggregated with a previously proposed fusion method, EmbraceNet+, which we adapt and integrate into our framework. We also present an extensive review of state-of-the-art fusion methods for multimodal emotion recognition. We evaluate the performance of the proposed architecture in tests that combine several modalities to classify emotions into four categories (i.e., happiness, neutral, sadness, and anger). Results reveal that our approach adapts to the quality and presence of modalities. Furthermore, the results are validated and compared with similar proposals, achieving performance competitive with state-of-the-art models.
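
    The abstract describes per-modality analysis followed by a fusion step that tolerates missing or low-quality modalities. The following is a minimal, hypothetical sketch (not the authors' EmbraceNet+ implementation) of that idea in PyTorch: each modality is projected to a shared embedding, and for every embedding dimension the contribution of one *available* modality is kept, so absent modalities are simply excluded from the sampling. All names, dimensions, and the four emotion classes used below are illustrative assumptions.

    ```python
    # Hypothetical sketch of modality-selective fusion in the spirit of EmbraceNet;
    # not the authors' EmbraceNet+ code. Names and dimensions are assumptions.
    import torch
    import torch.nn as nn


    class SimpleEmbraceFusion(nn.Module):
        """Project each modality to a shared embedding and, per embedding dimension,
        keep the contribution of exactly one modality that is actually available."""

        def __init__(self, input_dims, embed_dim=128, num_classes=4):
            super().__init__()
            # One "docking" projection per modality (e.g., face, voice, gesture).
            self.dockings = nn.ModuleList([nn.Linear(d, embed_dim) for d in input_dims])
            self.classifier = nn.Linear(embed_dim, num_classes)

        def forward(self, inputs, availability):
            # inputs: list of tensors [batch, input_dims[i]]
            # availability: [batch, n_modalities] with 1.0 = present, 0.0 = missing
            docked = torch.stack([dock(x) for dock, x in zip(self.dockings, inputs)], dim=1)
            # Sample, for each embedding dimension, which available modality to keep.
            probs = availability / availability.sum(dim=1, keepdim=True)          # [batch, n_mod]
            choice = torch.multinomial(probs, docked.size(-1), replacement=True)  # [batch, embed_dim]
            mask = torch.zeros_like(docked).scatter_(1, choice.unsqueeze(1), 1.0)
            fused = (docked * mask).sum(dim=1)                                    # [batch, embed_dim]
            return self.classifier(fused)


    # Minimal usage: three modalities, the second one missing for the first sample.
    model = SimpleEmbraceFusion(input_dims=[64, 32, 48])
    x = [torch.randn(2, 64), torch.randn(2, 32), torch.randn(2, 48)]
    avail = torch.tensor([[1.0, 0.0, 1.0], [1.0, 1.0, 1.0]])
    logits = model(x, avail)   # [2, 4] scores over happiness/neutral/sadness/anger
    ```

    Because missing modalities receive zero sampling probability, the fused embedding degrades gracefully rather than breaking, which is the behaviour the paper targets for robots with heterogeneous sensors.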

    Emotion Detection for Social Robots Based on NLP Transformers and an Emotion Ontology

    For social robots, knowledge of human emotional states is essential for adapting their behavior or associating emotions with other entities. Robots gather the information from which emotions are detected via different media, such as text, speech, images, or videos. The multimedia content is then processed to recognize emotions/sentiments, for example, by analyzing faces and postures in images/videos with machine learning techniques, or by converting speech into text to perform emotion detection with natural language processing (NLP) techniques. Keeping this information in semantic repositories offers a wide range of possibilities for implementing smart applications. We propose a framework that allows social robots to detect emotions and to store this information in a semantic repository based on EMONTO (an EMotion ONTOlogy), an ontology to represent emotions. As a proof of concept, we develop a first version of this framework focused on emotion detection in text, which can be obtained directly as text or by converting speech to text. We tested the implementation with a case study of tour-guide robots for museums that relies on a speech-to-text converter based on the Google Application Programming Interface (API) and a Python library, a neural network that labels emotions in text using NLP transformers, and EMONTO integrated with an ontology for museums; thus, it is possible to register the emotions that artworks produce in visitors. We evaluate the classification model, obtaining results equivalent to a state-of-the-art transformer-based model, with a clear roadmap for improvement.
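
    The abstract outlines a text-branch pipeline: speech is transcribed with a Google-backed converter accessed through a Python library, and the resulting text is labeled by a transformer-based emotion classifier. The sketch below is a hypothetical illustration of that pipeline, not the paper's implementation; the SpeechRecognition library, the specific model checkpoint, and the file path are assumptions.

    ```python
    # Hypothetical sketch of the text branch: speech-to-text via the Google Web Speech API
    # (through the SpeechRecognition library) followed by a transformer emotion classifier.
    # The model checkpoint below is an assumption, not necessarily the one used in the paper.
    import speech_recognition as sr
    from transformers import pipeline

    # Public emotion-classification checkpoint used here only for illustration.
    emotion_classifier = pipeline(
        "text-classification",
        model="j-hartmann/emotion-english-distilroberta-base",
    )


    def transcribe(wav_path: str) -> str:
        """Convert a recorded utterance to text using Google's web recognizer."""
        recognizer = sr.Recognizer()
        with sr.AudioFile(wav_path) as source:
            audio = recognizer.record(source)
        return recognizer.recognize_google(audio)


    def detect_emotion(wav_path: str) -> dict:
        """Transcribe the audio and return the transcript with its top emotion label."""
        text = transcribe(wav_path)
        return {"text": text, "emotion": emotion_classifier(text)[0]}


    # Example: a visitor's comment recorded by the tour-guide robot (path is illustrative).
    # print(detect_emotion("visitor_comment.wav"))
    ```

    In the framework described by the paper, the detected label would then be stored in the semantic repository via EMONTO, linking the visitor's emotion to the artwork that elicited it.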